Abstract

An emerging model for computational grids interconnects similar
multi-resource servers from distributed sites. A job submitted to the
grid can be executed by any of the servers; however, resource size or
balance may be different across servers. One approach to resource
management for this grid is to layer a global load distribution system
on top of the local job management systems at each site. Unfortunately,
classical load distribution policies fail on two aspects when applied to
a multi-resource server grid. First, simple load indices may not
recognize that a resource imbalance exists at a server. Second,
classical job selection policies do not actively correct such a
resource-imbalanced state. We show through simulation that new policies
based on resource balancing perform consistently better than the
classical load distribution strategies.


1.  Introduction 

An emerging model in high performance supercomputing is to interconnect
similar computing systems from geographically remote sites, creating a
near-homogeneous computational grid system. Computing systems, or
servers, are homogeneous in that any job submitted to the grid may be
sent to any server for execution. However, the servers may be
heterogeneous with respect to their exact resource configurations. For
example, the first phase of the NASA Metacenter linked a 42-node IBM SP2
at Langley and a 144-node SP2 at Ames [7]. The two servers were
homogeneous in that they were both IBM SP2s, with identical or
synchronized software configurations. However, they were heterogeneous
on two counts: the number of nodes in each server, and the fact that the
Langley machine consisted of thin nodes while the Ames machine had wide
nodes. A job could be executed by either server without modifications,
provided a sufficient number of nodes were available on that server.

*This work was supported by NASA grant NCC2-5268 and contract
NAS2-14303, and by Army High Performance Computing Research Center
(AHPCRC) cooperative agreement DAAH04-95-2-0003 and contract
DAAH04-95-C-0008. Access to computing facilities was provided by
AHPCRC, Minnesota Supercomputer Institute.

The resource manager for the near-homogeneous grid system is responsible
for scheduling submitted jobs to available resources such that some
global objective is satisfied, subject to the constraints imposed by the
local policies at each site. One approach to resource management for
near-homogeneous computational grids is to provide a global load
distribution system (LDS) layered on top of the local job management
system (JMS) at each site. This architecture is depicted in Figure 1.
The compute server at each site is managed by a local JMS. Users submit
jobs directly to their local JMS, which places the jobs in wait queues
until sufficient resources are available on the local compute server.
The global LDS monitors the load at each site. In the event that some
sites become heavily loaded while other sites are lightly loaded, the
LDS attempts to equalize the load across all servers by moving jobs
among the sites. The JMS at each site is then responsible for the
detailed allocation and scheduling of local resources to jobs submitted
directly to it, as well as to jobs which are assigned to it by the
global LDS. The local JMS also provides load status to the LDS to
support load distribution decisions, as well as a scheduling
Applications Programming Interface (API) to implement these decisions.
For example, in the NASA Metacenter, a peer-aware receiver-initiated
load balancing algorithm was used to move work from one IBM SP2 to the
other. When the workload on one SP2 dropped below a specified threshold,
the peer-aware load balancing mechanism would query the other SP2 to see
if it had any work which could be transferred for execution.

Figure 1. Near-Homogeneous Metacomputing Resource Management Architecture

The architecture depicted in Figure 1 is conceptually identical to
classical load balancing in a parallel or distributed computer with two
notable exceptions. First, the compute server at each site may be a
complex combination of multiple types of resources (CPUs, memory, disks,
switches, and so on). Similarly, the applications submitted by the users
are described by multiple resource requirements. We generalize these
notions and define a $K$-resource server and corresponding
$K$-requirement job. Each server $S_i$ has $K$ resources,
$S_i^0, S_i^1, \ldots, S_i^{K-1}$. Each job $J_j$ is described by its
requirements for each resource type, $J_j^0, J_j^1, \ldots, J_j^{K-1}$.
Note that the servers are still considered homogeneous from the jobs'
perspective, as any job may be sent to any server for execution.


The second exception is that the physical configurations of the $K$
resources for each server may be heterogeneous. This heterogeneity can
be manifested in two ways. The amount of a given resource at one server
site may be quite different from the amount at another site. For
example, server $S_i$ may have more memory than server $S_j$.
Additionally, servers may have a different balance of each resource. For
example, one server may have a (relatively) large memory with respect to
its number of CPUs while another server may have a large number of CPUs
with less memory.

Classical load balancing attempts to maximize system throughput by
keeping all processors busy. We extend this notional goal to fully
utilizing all $K$ resources at each site. One heuristic for achieving
this objective is to match the job mix at each server with the
capabilities of that server, in addition to balancing the load across
servers. For example, if a server has a large shared memory, then the
job mix in the local wait queue should be adjusted by the global LDS to
contain jobs which are generally memory intensive. Compute intensive
jobs should be moved to a server which has a relatively large number of
CPUs with respect to its available memory. The goal of the LDS is
therefore to balance the total resource demand among all sites, for each
type of resource.

This work investigates the use of load balancing techniques to solve the
global load distribution problem for computational grids consisting of
near-homogeneous multi-resource servers. The complexity of
multi-resource compute servers, along with the multi-resource
requirements of the jobs, causes the methods developed in past load
balancing research to fail in at least two areas. First, the definition
of the load at a given server is not easily described by a single load
index. Specifically, a resource imbalance, in which the local job mix
does not match the capabilities of the local server, is not directly
detectable. This impacts the ability of the global LDS to match the
workload at a site to the capabilities of the site. We propose a simple
extension to a classical load index measure based on a resource
balancing heuristic to provide this additional level of descriptive
detail. Second, once a resource imbalance is detected, existing
approaches to selecting which jobs to move between servers fail to
actively correct the problem. We provide an analogous job selection
policy, also based on resource balancing, which heuristically corrects
the resource imbalance. The combination of these two extensions provides
the framework for a global LDS which consistently outperforms existing
approaches over a wide range of compute server characteristics.

The remainder of this paper is organized as follows. Section 2 provides
an overview of relevant past research, concluding with variants of a
baseline load balancing algorithm drawn from the literature. Section 3
investigates the limitations of the baseline algorithms, and provides
extensions based on the resource balancing heuristic. A description of
our simulation environment is given in Section 4. The performance
results of our new load balancing methods as compared to the baseline
algorithms are also summarized in Section 4. Finally, Section 5 provides
conclusions and a brief overview of our current work in progress.

2.  Preliminaries 

Research related to this effort is drawn from single server scheduling
in the presence of multiple resource requirements and general load
balancing methods for homogeneous parallel processing systems.

Recent research in job scheduling for a single server has demonstrated
the benefits of including information about the memory requirements of a
job in addition to its CPU requirements [13, 14]. The generalized
$K$-resource single server scheduling problem was studied in [10], where
it was shown that simple backfill algorithms based on multi-dimensional
packing heuristics consistently outperform single-resource algorithms,
with increasing $K$. These efforts all suggest that the local JMS at
each site should be multi-resource aware in making its scheduling
decisions. This induces requirements on the global LDS to provide a job
mix to a local server which maximizes the success rate of the local
server.

The general goal of a workload distribution system is to have sufficient
work available to every computational node to enable the efficient
utilization of that node. A centralized work queue provides every node
equal access to all available work, and is generally regarded as being
efficient in achieving this goal. Unfortunately, the centralized work
queue is generally not scalable as contention for the single queue
structure increases with the number of nodes. In massively parallel
processing systems where the number of nodes was expected to reach into
the thousands, this was a key concern. In distributed systems, the
latency for querying the central queue potentially increases as the
number of nodes is increased. Load balancing algorithms attempt to
emulate a central work queue by maintaining a representative workload
across a set of distributed queues, one per compute node. In this paper,
we investigate only the performance of load balancing across distributed
queues.

Classical load balancing algorithms are typically based on a load index
which provides a measure of the workload at a node relative to some
global average, and four policies which govern the actions taken once a
load imbalance is detected [15]. The load index is used to detect a load
imbalance state. Qualitatively, a load imbalance occurs when the load
index at one node is much higher (or lower) than the load index on the
other nodes. The length of the CPU queue has been shown to provide a
good load index on time-shared workstations when the performance measure
of interest is the average response time [2, 11]. In the case of
multiple resources (disk, memory, etc.), a linear combination of the
length of all the resource queues provided an improved measure, as job
execution time may be driven by more than CPU cycles [5].

The four policies that govern the action of a load balancing algorithm
when a load imbalance is detected deal with information, transfer,
location, and selection. The information policy is responsible for
keeping up-to-date load information about each node in the system. A
global information policy provides access to the load index of every
node, at the cost of additional communication for maintaining accurate
information [1].

The transfer policy deals with the dynamic aspects of a system. It uses
the nodes' load information to decide when a node becomes eligible to
act as a sender (transfer a job to another node) or as a receiver
(retrieve a job from another node). Transfer policies are typically
threshold based. Thus, if the load at a node increases beyond a
threshold $T_s$, the node becomes an eligible sender. Likewise, if the
load at a node drops below a threshold $T_r$, the node becomes an
eligible receiver. Load balancing algorithms which focus on the transfer
policy are described in [2, 15, 16].

The location policy selects a partner node for a job transfer
transaction. If the node is an eligible sender, the location policy
seeks out a receiver node to receive the job selected by the selection
policy (described below). If the node is an eligible receiver, the
location policy looks for an eligible sender node. Load balancing
approaches which focus on the use of the location policy are described
in [8, 9].

Once a node becomes an eligible sender, a selection policy is used to
pick which of the queued jobs is to be transferred to the receiver node.
The selection policy uses several criteria to evaluate the queued jobs.
Its goal is to select a job that reduces the local load, incurs as
little cost as possible in the transfer, and has good affinity to the
node to which it is transferred. A common selection policy is
latest-job-arrived, which selects the job which is currently in last
place in the work queue.

The  primary  difference  between  existing  load  balancing 
algorithms  and  our  global  load  distribution  requirements  is 
that  our  node  is  actually  a  multi-resource  server.  With  this 
extension  in  mind,  we  define  the  following  baseline  load 
balancing  algorithm: 

•  Load Index. The load index is based on the average resource
requirements of the jobs waiting in the queue at a given server. This
index is termed the resource average (RA) index. For our multi-resource
server formulation, each resource requirement for a job in the queue
represents a percentage of the server resource that it requires,
normalized to unity. Therefore, the RA index is a relative index which
can be used to compare the loads on different servers.

•  Information Policy. As the information policy is not the subject of
this study, we choose to use a policy which provides perfect information
about the state of the global system. We assume a global information
policy with instantaneous update.

•  Transfer Policy. The transfer policy is threshold based, since it has
been shown to provide robust performance across a range of load
conditions. A server becomes a sender when its load index grows above
the global load average by a threshold, $T_s$. Conversely, a server
becomes a receiver when its load index falls below the global average by
a threshold, $T_r$.

•  Location Policy. The location policy is also not the subject of this
study. Therefore, we use a simple location policy which heuristically
results in fast convergence to a balanced load state. In the event that
the transfer policy indicates that a server becomes a sender, the
location policy selects the server which currently has the least load to
be the receiver. However, the selected server must also be an eligible
receiver, meaning that it currently has a load which is $T_r$ below the
global average. Conversely, if the server is a receiver, the location
policy selects the server which currently has the highest load that is
$T_s$ above the global average. If no eligible partner is found, the
load balancing action is terminated.

•  Selection Policy. A latest-job-arrived selection policy (LSP) is used
to select a job from the sending server to be transferred to the
receiving server. This selection policy generally performs well with
respect to achieving a good average response time, but suffers from some
jobs being moved excessively. Therefore, each job keeps a job transfer
count which records the number of times it has been moved. When this
count reaches a threshold $T_c$, the job is no longer eligible to be
selected for a transfer. Jobs which are already executing are excluded
from being transferred.

The sender initiated (SI), receiver initiated (RI), and symmetrically
initiated (SY) algorithm variants are generated using a transfer policy
which triggers a load balancing action on $T_s$, $T_r$, or both,
respectively. All baseline variants use the RA load index and the LSP
job selection policy.
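
The baseline sender initiated (SI) variant described above can be
sketched roughly as follows. This is a minimal illustration, not the
authors' simulator: the class and function names (`Job`, `Server`,
`ra_index`, `lsp_select`, `sender_initiated_step`) are hypothetical, and
the thresholds are assumed to be relative fractions of the global
average load.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Job:
    req: List[float]      # requirement per resource, as a fraction of the server
    transfers: int = 0    # job transfer count


@dataclass
class Server:
    # Wait queue in arrival order: the last element is the latest arrival.
    queue: List[Job] = field(default_factory=list)


def ra_index(server: Server) -> float:
    """Resource average (RA): mean over resources of the total queued demand."""
    if not server.queue:
        return 0.0
    k = len(server.queue[0].req)
    totals = [sum(job.req[r] for job in server.queue) for r in range(k)]
    return sum(totals) / k


def lsp_select(server: Server, t_c: int) -> Optional[int]:
    """Latest-job-arrived selection; skips jobs already moved t_c times."""
    for idx in range(len(server.queue) - 1, -1, -1):
        if server.queue[idx].transfers < t_c:
            return idx
    return None


def sender_initiated_step(servers: List[Server], sender: int,
                          t_s: float, t_r: float, t_c: int) -> bool:
    """One SI load balancing action; returns True if a job was moved."""
    loads = [ra_index(s) for s in servers]      # perfect global information
    avg = sum(loads) / len(loads)
    if loads[sender] <= avg * (1 + t_s):        # transfer policy: not a sender
        return False
    # Location policy: the least-loaded server, if it is an eligible receiver.
    receiver = min(range(len(servers)), key=loads.__getitem__)
    if loads[receiver] >= avg * (1 - t_r):
        return False
    idx = lsp_select(servers[sender], t_c)      # selection policy (LSP)
    if idx is None:
        return False
    job = servers[sender].queue.pop(idx)
    job.transfers += 1
    servers[receiver].queue.append(job)
    return True
```

Only waiting jobs are kept in `queue`, so executing jobs are never
candidates for transfer. A receiver initiated step would mirror this
logic, starting from a server whose load has fallen below the lower
threshold.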

3.  Multi-Resource Aware Load Balancing Policies

In this section, we first discuss the limitations of the resource
average load index, RA, and the latest-job-arrived selection policy,
LSP, of the baseline load balancing algorithms for the heterogeneous
multi-resource servers problem. We provide an example which illustrates
where these naive strategies can fail to match the workload to the
servers, resulting in local workloads which exhibit a resource
imbalance. We then provide extensions to the load index and the job
selection policy which strive to balance the resource usage at each
server.

3.1.  Limitations  of  RA  and  LSP 

The resource average load index, RA, and the latest-job-arrived job
selection policy, LSP, in the baseline algorithm fail in the
multi-resource server load balancing context. The following discussion
gives an example of these failures and provides some insight into
possible new methods. Our new methods will be further discussed in
Section 3.2.

In past research, the index used to measure the load on a server with
respect to multiple resources consisted of a linear combination or an
average of the resource requirements for the actively running jobs in a
time-shared system. A corresponding index which may be applied to batch
queued space-shared systems is to use the average of the total resource
requirements of the jobs waiting in the wait queue. However, this may
not always indicate a system state where there exists a resource
imbalance, that is, the total job requirements for one resource exceed
the requirements for the other resources. Essentially, a server with a
mismatched work mix will be forced to leave some resources idle while
other resources are fully utilized, resulting in an inefficient use of
the system as a whole.

Figure 2(a) depicts the state of the job ready queues, $RQ_0$ and
$RQ_1$, for a two-server system, $S_0$ and $S_1$. Assume that each
server has three resources, $S_i^0$, $S_i^1$, and $S_i^2$, and that the
configuration for the two servers is identical, $S_0^0 = S_1^0$,
$S_0^1 = S_1^1$, and $S_0^2 = S_1^2$. Each of the two ready queues
currently has two jobs. The job which arrived latest at each server is
on the top of the ready queue for that server. For example, the latest
arriving job, $J_0$, in $RQ_0$ has the resource requirements
$J_0^0 = 2$, $J_0^1 = 3$, and $J_0^2 = 2$. Note that the resource
requirements for a job are given as a percentage of the total available
in the server. The total workload for each resource, $k$, in a given
server, $S_i$, is denoted as

$$W_i^k = \sum_{J_j \in RQ_i} J_j^k, \quad 0 \le i < S, \; 0 \le k < K.$$

The resource average load index for a given server, $S_i$, is then given
by

$$RA_i = \mathrm{Avg}(W_i^k), \quad 0 \le k < K.$$

In this example, $K = 3$ and $RA_0 = RA_1 = 4$.

The third queue in Figure 2(a), $RQ_{avg}$, represents the global
average workload for each resource in $RQ_0$ and $RQ_1$. The global
average workload for resource $k$ is then given by

$$W_{avg}^k = \mathrm{Avg}(W_i^k), \quad 0 \le i < S.$$

Here, $S = 2$ and $W_{avg}^0 = W_{avg}^1 = W_{avg}^2 = 4$, meaning that
on average, each $RQ_i$ has a total requirement of 4 percent for each
resource. The global resource average load index is simply

$$RA = \mathrm{Avg}(W_{avg}^k), \quad 0 \le k < K,$$

which in this example is $RA = 4$. Server $S_i$ is defined to be in a
load balanced state as long as
$RA \cdot (1 - T_x) < RA_i < RA \cdot (1 + T_x)$, where $T_x$ is the
transfer policy threshold, as defined in Section 2. Since
$RA_0 = RA_1 = RA$, the system is believed to be in a load balanced
state.

Even though the RA index indicates a balanced load, it is clear from
Figure 2(a) that the job mix in $RQ_0$ has a higher requirement for
resource $S_0^2$ than for resources $S_0^0$ and $S_0^1$. The result is
that $S_0$ will probably be unable to fully utilize resources $S_0^0$
and $S_0^1$ as resource $S_0^2$ becomes the bottleneck. Conversely, the
job mix in $RQ_1$ has a higher requirement for resources $S_1^0$ and
$S_1^1$ than for $S_1^2$, resulting in an inefficient use of resource
$S_1^2$. Therefore, the workload at each server suffers from a resource
imbalance.

In order to detect this problem, we define a second load index, called
resource balance (RB), which measures the resource imbalance at a given
server or globally across the system. Namely, for server $S_i$,
$0 \le i < S$,

$$RB_i = \frac{\mathrm{Max}(W_i^k)}{\mathrm{Avg}(W_i^k)} = \frac{\mathrm{Max}(W_i^k)}{RA_i}, \quad 0 \le k < K.$$

Similarly,

$$RB = \frac{\mathrm{Max}(W_{avg}^k)}{\mathrm{Avg}(W_{avg}^k)} = \frac{\mathrm{Max}(W_{avg}^k)}{RA}, \quad 0 \le k < K.$$

Heuristically, the RB index of a server measures how balanced the job
mix is with respect to the different resource requirements. If the total
resource requirements are all the same, then the local RB measure is
unity, since $\mathrm{Max}(W_i^k) = \mathrm{Avg}(W_i^k)$. This
corresponds to the case where the workload is matched to the server. The
global RB is a measure of how well the total work in the system matches
the capabilities of all the servers in the system. The goal of the load
balancing algorithm is to move each server towards this global balanced
resource level. In Figure 2(a), $RB_0 = 6/4$ or 1.5, while $RB_1 = 5/4$
or 1.25. Since $RB = 4/4$ or 1.0, the two servers recognize the
existence of a resource imbalanced state.

Once a resource imbalance is detected, the load balancing policies must
actively correct the imbalance. Figure 2(b) shows the result of using
the LSP policy to adjust the resource imbalance. Server $S_0$ sends its
latest job to $S_1$, while $S_1$ sends its latest job to $S_0$. Note
that the resource balance index improves on both servers, with
$RB_0 = 4/3.33$ or 1.2, while $RB_1 = 5/4.66$ or 1.07. However, the
resource balance could have been improved even further, as shown in
Figure 2(c), by transferring the jobs which best balance the workload at
both servers. We refer to this heuristic policy as the balanced job
selection policy or BSP.

3.2.  Resource Balancing Algorithms


In the following discussion, we extend the baseline load balancing
algorithm with the heuristic RB load index and the BSP job selection
policy. In general, the goal of these extensions is to move the system
to a state where the load is balanced across the servers and the job mix
at each server matches the resource capabilities provided by that
server. These extensions are described below.

(a) Comparison of RA and RB Load Index Measures
(b) Result of Latest Job Selection Policy (LSP)
(c) Result of Balanced Job Selection Policy (BSP)

Figure 2. Limitations of RA and LSP
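
The RA and RB indices of Section 3.1 can be checked numerically with a
short sketch such as the one below. The function names are hypothetical,
and the four job vectors are assumptions chosen so that the per-resource
totals reproduce the index values quoted in the text; only the latest
job in the first queue, (2, 3, 2), is taken from the example itself.

```python
from typing import List, Sequence


def total_workload(queue: Sequence[Sequence[float]]) -> List[float]:
    """Per-resource sum of the requirements of the jobs in one ready queue."""
    return [sum(col) for col in zip(*queue)]


def ra(w: Sequence[float]) -> float:
    """Resource average index: mean of the per-resource workloads."""
    return sum(w) / len(w)


def rb(w: Sequence[float]) -> float:
    """Resource balance index: Max / Avg (assumes non-zero total demand)."""
    return max(w) / ra(w)


def global_average(workloads: Sequence[Sequence[float]]) -> List[float]:
    """Per-resource workload averaged across all servers."""
    return [sum(col) / len(col) for col in zip(*workloads)]


def is_load_balanced(ra_i: float, ra_global: float, t_x: float) -> bool:
    """Balanced when RA_i lies within a relative band t_x around the global RA."""
    return ra_global * (1 - t_x) < ra_i < ra_global * (1 + t_x)


# Assumed ready queues, oldest job first, latest arrival last.
rq0 = [(1, 0, 4), (2, 3, 2)]
rq1 = [(3, 2, 2), (2, 3, 0)]

w0, w1 = total_workload(rq0), total_workload(rq1)
w_avg = global_average([w0, w1])

print(ra(w0), ra(w1), ra(w_avg))   # 4.0 4.0 4.0
print(rb(w0), rb(w1), rb(w_avg))   # 1.5 1.25 1.0
```

With these assumed queues the RA index reports a balanced load on both
servers, while the RB index exposes the mismatch. Swapping the latest
job of each queue, as LSP would, gives RB values of about 1.2 and 1.07.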


Sender Initiated, Balanced Selection Policy: SI_BSP. The baseline sender
initiated algorithm, SI, is extended to SI_BSP by modifying the
selection policy as follows. The fact that the load balancing action was
triggered by the condition that the load index, RA, of a given server
was above the global average implies that it has more work than at least
one other server. Thus, this heavily loaded server needs to transfer
work to another server. The BSP policy selects the job for transfer
(out) which results in the best resource balance of the local queue.
Note that transferring a job may actually worsen the resource imbalance,
but we proceed nonetheless so that the overall excess workload can be
reduced. Also, the resource balance at the receiving server may worsen
as well. However, the receiving server currently has a workload
shortage, so it may be executing less efficiently anyway.

Sender Initiated, RB Index, Balanced Selection Policy: SI_RB_BSP. The
SI_RB_BSP algorithm extends the SI_BSP algorithm by including the RB
load index, and modifying the transfer and selection policies as
follows. First, the transfer policy triggers a load balancing action
based on RA or RB. If the action is based on RA, SI_RB_BSP proceeds as
SI_BSP. However, if the action is based only on RB, the selection policy
is further modified over that used for SI_BSP. The job which positively
improves the resource balance of the local queue the most is selected
for transfer (out). If no such job is found, no action occurs.
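
The two sender-side selection rules just described can be illustrated
with the following sketch. It is an assumption-laden outline rather than
the authors' implementation: `rb_index` and `bsp_select_out` are
hypothetical names, jobs are plain tuples of per-resource requirements,
the job transfer count is ignored, and an empty queue is treated as
perfectly balanced.

```python
from typing import Optional, Sequence

Job = Sequence[float]


def rb_index(queue: Sequence[Job]) -> float:
    """RB = Max / Avg of the per-resource totals; 1.0 for an empty queue."""
    if not queue:
        return 1.0
    totals = [sum(col) for col in zip(*queue)]
    avg = sum(totals) / len(totals)
    return max(totals) / avg if avg > 0 else 1.0


def bsp_select_out(queue: Sequence[Job],
                   require_improvement: bool = False) -> Optional[int]:
    """Index of the job whose removal leaves the local queue best balanced.

    With require_improvement=False this mimics the SI_BSP rule: a job is
    always chosen, even if the balance gets worse. With
    require_improvement=True it mimics the RB-triggered rule of SI_RB_BSP:
    a job is chosen only if removing it strictly lowers the local RB.
    """
    if not queue:
        return None
    current = rb_index(queue)
    best_idx, best_rb = None, float("inf")
    for idx in range(len(queue)):
        remaining = [job for n, job in enumerate(queue) if n != idx]
        candidate = rb_index(remaining)
        if candidate < best_rb:
            best_idx, best_rb = idx, candidate
    if require_improvement and best_rb >= current:
        return None
    return best_idx
```

The receiver-side variants would apply the same idea to the candidate
jobs of the sending server, evaluating the RB of the receiving queue
after each possible addition.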

Receiver Initiated, Balanced Selection Policy: RI_BSP. The baseline
receiver initiated algorithm, RI, is extended to RI_BSP in a fashion
complementary to SI_BSP.

Receiver Initiated, RB Index, Balanced Selection Policy: RI_RB_BSP. The
RI_RB_BSP algorithm extends the RI_BSP algorithm in a fashion
complementary to SI_RB_BSP.

Symmetrically Initiated, Balanced Selection Policy: SY_BSP. The baseline
symmetrically initiated algorithm, SY, is extended to SY_BSP as follows.
If the transfer policy triggers a send action, SY_BSP proceeds as
SI_BSP. Alternatively, if the transfer policy triggers a receive action,
SY_BSP proceeds as RI_BSP.

Symmetrically Initiated, RB Index, Balanced Selection Policy:
SY_RB_BSP. The SY_RB_BSP algorithm extends the SY_BSP algorithm as
follows. If the action is based on RA, SY_RB_BSP proceeds as SY_BSP.
However, if the action is based only on RB, then SY_RB_BSP performs both
send and receive actions using methods identical to SI_RB_BSP and
RI_RB_BSP. Heuristically, this maintains the RA index but improves the
RB index.

4.  Experimental  Results 

The  baseline  and  extended  load  balancing  algorithms 
were  implemented  on  a  simulated  system  that  is  described 
in  Section  4.1.  The  experimental  results  are  summarized  in 
Section  4.2. 

4.1.  System  Model 

The simulation system follows the general form of Figure 1. The server
model, workload model, and performance metrics are discussed below.

Server Model. A system with 16 servers was used for the current set of
experiments. A server model is specified by the amount of each of the
$K$ resource types it contains and the choice of the local scheduler.
For all simulations, the local scheduler uses a backfill algorithm with
a resource balancing job selection criterion [10]. To our knowledge,
this is the best performing local scheduling algorithm for the
multi-resource single server problem. At this point, we assume that the
jobs are rigid, meaning that they must receive the required resources
before they can execute. We also assume that the execution time of a job
is the same on any server. Simulation results are reported for a value
of $K = 8$.

Two independent parameters were used to specify the degree of
heterogeneity across the servers in the simulated system. First, within
a single server, the server resource correlation parameter, $S_{rc}$,
specifies how the resources of a given server are balanced. This
represents the intra-server resource heterogeneity measure. For example,
assume each server has two resources, CPUs and memory. If a correlation
value of about one were specified, then a server with a large memory
would also have a large number of CPUs. Conversely, if a correlation
value of about negative one were used, then a server with a large memory
would probably have a low number of CPUs. Finally, a correlation value
near zero implies that the resource sizes within a given server are
unrelated. The value of the resource correlation ranged from 0.15 to
0.85 in the simulations (our simulator is capable of generating $S_{rc}$
values in the range $-1.0/(K - 1) < S_{rc} < 1.0$).
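
The lower limit on the resource correlation quoted above coincides with
a standard bound for $K$ variables that share one common pairwise
correlation. The short derivation below assumes such an equicorrelation
model with correlation $\rho$; whether the simulator uses exactly this
model is an assumption made here, not something stated in the text.

```latex
% Assumed equicorrelation matrix for K resources (rho plays the role of S_rc)
R = (1 - \rho)\, I_K + \rho\, \mathbf{1}\mathbf{1}^{\top}

% Eigenvalues of R:
%   1 + (K - 1)\rho   with multiplicity 1       (eigenvector \mathbf{1})
%   1 - \rho          with multiplicity K - 1   (vectors orthogonal to \mathbf{1})

% R is a valid (positive definite) correlation matrix iff both are positive:
1 + (K - 1)\rho > 0 \;\text{ and }\; 1 - \rho > 0
\quad\Longleftrightarrow\quad
-\frac{1}{K - 1} < \rho < 1

% For K = 8, as used in the simulations: \rho > -1/7 \approx -0.143
```

Under this assumption the range of 0.15 to 0.85 used in the experiments
lies well inside the admissible interval for $K = 8$.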

The second parameter is the server resource variance,
Srv, which describes the range of sizes of a single
resource found across all of the servers. This
represents the inter-server heterogeneity measure. Again,
a resource variance of about one implies that the number of
CPUs found in server S_i will be approximately the same as
the number of CPUs found in server S_j for 0 ≤ i,j < S.
In general, a resource variance of Srv = V implies that
the server S_i with the largest amount of a resource k has
V times as much of that resource as the server S_j which
has the smallest amount of that resource. All other servers
have some amount of resource k between S_j^k and S_i^k. The
value of the resource variance ranged from 1.2 to 8.0 for our
experiments.
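
The definition of Srv can be illustrated with a short Python sketch that assigns one resource across the servers so that the largest capacity is V times the smallest. The even spacing of the intermediate servers and the function name are assumptions; the text only requires that the other servers fall somewhere between the two extremes.

```python
def capacities_across_servers(num_servers, srv, smallest=1.0):
    """Capacities of one resource on each server, with max/min equal to srv.

    Intermediate servers are evenly spaced (an illustrative assumption).
    """
    if num_servers < 2:
        raise ValueError("need at least two servers")
    if srv < 1.0:
        raise ValueError("srv is a max/min ratio, so it must be >= 1")
    step = smallest * (srv - 1.0) / (num_servers - 1)
    return [smallest + i * step for i in range(num_servers)]
```

For example, `capacities_across_servers(5, 8.0)` spans capacities from 1.0 up to 8.0, the largest variance used in the experiments.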

Workload Model. The two main aspects of the simulated
workload are the generation of multi-resource jobs and the
job arrival rate. Recent studies on workload models have
focused primarily on a single resource: the number of CPUs
that a job requires. Two general results from these studies
show that the distribution of CPU requirements is generally
hyperexponential, but with strong discrete components
at powers of two and squares of integers [3]. An additional
study investigated the distribution of memory requirements
on the 1024 processor CM-5 at Los Alamos National
Laboratory. The conclusion was that memory requirements
are also hyperexponentially distributed with strong discrete
components. Additionally, there was a weak correlation between
the CPU and memory requirements for the job stream
studied [4].

We generalize these results to a K-resource workload as
follows. The multiple resource requirements for a job in
the job stream are described by two parameters. The kth
resource requirement for job j, J_j^k, is drawn from a
hyperexponential distribution with mean X_k. The
correlation between resource requirements within a single
job, Jrc, is also specified. A single set of workload parameters
was used for all of the initial simulations reported here,
in which X_k = 0.15, 0 ≤ k < K, and the resource
correlation Jrc = 0.25. Essentially, the average job requires
15% of each resource in an average server, and its relative
resource requirements are near random.
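
As a rough illustration of drawing a requirement with a given mean from a hyperexponential distribution, the Python sketch below mixes two exponential branches. The branch probability `p_short` and the scale `short_scale` are hypothetical values chosen only so that the overall mean equals the target. The sketch draws each resource independently, so it does not reproduce the intra-job correlation Jrc, the discrete components, or the reduced probability of small requirements.

```python
import random


def hyperexponential(mean, rng, p_short=0.8, short_scale=0.5):
    """Sample a two-branch hyperexponential value with the given overall mean.

    With probability p_short the branch mean is short_scale * mean; the other
    branch mean is chosen so that the mixture mean equals `mean`.
    """
    long_scale = (1.0 - p_short * short_scale) / (1.0 - p_short)
    scale = short_scale if rng.random() < p_short else long_scale
    return rng.expovariate(1.0 / (mean * scale))


def make_job(num_resources, mean, rng):
    """One job as a list of independent per-resource requirements."""
    return [hyperexponential(mean, rng) for _ in range(num_resources)]
```

With the parameters used in the simulations, a job would be drawn as `make_job(8, 0.15, random.Random())`, with each entry expressed as a fraction of an average server's capacity.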

Figure 3(a) shows the single resource probability distribution
used for the workload. Note that the probability for
small resource requirements is reduced relative to a strictly
exponential distribution. We justify this modification by noting
that small jobs are probably not good candidates for load
balancing activity as they do not impact the local job scheduler
efficiency significantly (except to improve it). Figure 3(b)
shows the joint probability distribution for a dual
resource (K = 2) system. In general, the joint probability
distribution shown in Figure 3(b) is nearly identical for all
pairs (i,j), 0 ≤ i,j < K, of resources in the job stream.
This workload model has also been used to study
multi-resource scheduling on a single server [10].

The  job  arrival  rate  generally  affects  the  total  load  on  the 
system.  A  high  arrival  rate  results  in  a  large  number  of  jobs 
being  queued  at  each  server,  while  a  low  arrival  rate  reduces 
the  number  of  queued  jobs.  For  our  initial  simulations,  we 
selected  an  arrival  rate  that  resulted  in  an  average  of  32  jobs 
per server in the system. As each job arrives, it is sent to a
server selected randomly from a uniform distribution ranging
from 0 to S - 1. A final assumption is that the nature
of  the  workload  model  impacts  only  the  absolute  values  of 
the  system  performance,  not  the  relative  performance  of  the 
algorithms  under  study. 

Performance  Metrics.  A  single  performance  metric, 
throughput,  is  our  current  method  for  evaluating  these  al¬ 
gorithms.  Throughput  is  measured  as  the  total  elapsed  time 
from  when  the  first  job  arrives  to  when  the  last  job  departs. 

4.2.  Simulation  Results 

Our initial simulation results are depicted in Figures
4(a)-(f). Recall that load balancing algorithms essentially
try  to  mimic  a  central  work  queue  from  which  any  server 
can  select  jobs  as  its  resources  become  available.  Therefore, 
the  performance  results  for  the  load  balancing  algorithms 
are  normalized  against  the  results  of  a  system  with  a  central 
work  queue.  For  each  graph  in  the  figure,  the  x  axis  rep¬ 
resents  the  server  resource  variance  parameter,  Srv,  as  de¬ 
scribed  previously,  while  the  y  axis  represents  the  through¬ 
put  of  the  algorithms,  normalized  to  the  throughput  of  the 
central  queue  algorithm.  The  following  paragraphs  summa¬ 
rize  these  results. 
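
One possible reading of the normalization on the y axis is sketched below in Python. It assumes that both runs process the same job stream, so the job count cancels and the normalized throughput reduces to a ratio of elapsed times; the function names are hypothetical and the simulator's actual bookkeeping may differ.

```python
def elapsed_time(arrival_times, departure_times):
    """Time from the first job arrival to the last job departure."""
    return max(departure_times) - min(arrival_times)


def normalized_throughput(algorithm_elapsed, central_queue_elapsed):
    """Throughput of an algorithm relative to the central queue run.

    Assumes both runs complete the same n jobs, so
    (n / algorithm_elapsed) / (n / central_queue_elapsed)
    simplifies to the ratio below.
    """
    return central_queue_elapsed / algorithm_elapsed
```

Under this reading, a load balancing run that takes 125 time units on a stream the central queue finishes in 100 has a normalized throughput of 0.8.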

Impact of the Resource Balancing Policies. Figures
4(a)-(c) depict the performance of the sender initiated,
receiver initiated, and symmetrically initiated baseline and
extended algorithms, normalized to the performance of the
central queue algorithm. For these experiments, K = 8 and
Src = 0.50 (resources within a server are very weakly
correlated). In comparing the performance of the baseline and
the extended algorithms, we see that the extended variants
consistently outperform the baseline algorithms from which
they were derived. The addition of the intelligent job
selection policy, BSP, provides a 5-10% gain in the SI_BSP,
RI_BSP, and SY_BSP algorithms over the SI, RI, and SY
algorithms, respectively. Moreover, the addition of the RB
load index and associated transfer policy further increases
these gains for SI_RB_BSP, RI_RB_BSP, and SY_RB_BSP.

Effects of Server Resource Correlation, Src. The jobs
which arrive at each server may or may not have a natural
affinity for that server. For example, if a server has a
large memory and a few CPUs, a job which is memory
intensive has a high affinity for that server. However, a job
which is CPU intensive has a low affinity for that server.
For a job stream with a fixed intra-job resource correlation,
Jrc, the probability that an arriving job has good affinity
to a server increases as Src increases. A larger natural
affinity increases the packing efficiency of the local
schedulers, improving the throughput. Figures 4(d)-(f) compare
the performance of the RI_RB_BSP, SI_RB_BSP, and
SY_RB_BSP algorithms over the range of server resource
correlation values, Src = {0.15, 0.50, 0.70}. Generally, as
the value of Src increases, the performance of the load
balancing algorithms also improves, due to an increased
probability of natural affinity.

Figure 3. Multi-Resource Workload Model

The SI_RB_BSP algorithm performs slightly better than
RI_RB_BSP at low values of Src, as seen in Figure 4(d).
However, RI_RB_BSP begins to outperform SI_RB_BSP at
higher values of Src, as seen in Figures 4(e) and 4(f). At low
values of Src, the SI variant can actively transfer out jobs
with low affinity, which occur with high probability, while
the RI variant can only try to correct the affinity of its
total workload. Higher values of Srv magnify this problem.
Therefore,  the  performance  advantage  goes  to  the  SI  vari¬ 
ant.  For  higher  values  of  Src,  the  probability  of  good  job- 
server  affinity  is  also  higher.  When  accompanied  by  higher 
Srv,  the  system  may  be  thought  of  as  having  some  larger 
servers  and  some  smaller  servers,  with  good  job  affinity  to 
any  server.  In  this  case,  the  throughput  of  the  system  is 
driven  by  the  efficiency  of  the  larger  servers.  In  the  SI  vari¬ 
ant,  the  smaller  servers  will  tend  to  initiate  load  balancing 
actions,  by  sending  work  to  the  larger  servers.  So  while  the 
smaller  servers  may  execute  efficiently,  the  larger  servers 
may  not.  However,  in  the  RI  variant,  the  larger  servers  will 
tend  to  initiate  load  balancing,  and  intelligently  select  which 
work  to  receive  from  the  smaller  servers.  Now,  the  larger 
servers  will  tend  to  execute  more  efficiently.  For  this  rea¬ 
son,  the  performance  advantage  goes  to  the  RI  variant. 

Impact  of  Server  Resource  Variation,  Srv.  As  the  re¬ 
source  variation,  Srv,  increases  in  the  graphs  of  Figure  4, 
the throughput of the load balancing algorithms drops relative
to the central queue algorithm. This is because
the average job size (the size of a job's resource requirements)
is not taken into account when selecting jobs for
transfer. At higher server resource variances, some servers
have a very small amount of one or more resources. However,
the average size of the jobs ending up on the servers with
small resource capacities is the same as that of the jobs ending up
on the larger servers. The small size of the resources in
these  servers,  relative  to  the  average  resource  requirement 
of  the  arriving  jobs,  causes  packing  inefficiencies  by  the  lo¬ 
cal  scheduler,  due  to  job  size  granularity.  In  the  case  of  a 
centralized  queue,  the  servers  with  small  resource  capacities 
are  more  likely  to  find  jobs  with  smaller  resource  require¬ 
ments.  In  short,  simply  balancing  the  workload  resource 
characteristics  is  not  sufficient.  Other  workload  character¬ 
istics  must  also  be  emulated  in  the  local  queues,  such  as  the 
average  job  requirements  relative  to  the  server  resource  ca¬ 
pacities.  This  is  a  topic  in  our  current  work  in  progress  and 
is  briefly  discussed  in  Section  5. 


Central Queue vs. Load Balancing. A final observation
may be drawn from the graphs in Figure 4. Even when
the servers are all similarly configured (e.g. Srv ≈ 1 and
Src ≈ 1), there is a consistent performance gap of 15% for
all baseline and extended load balancing algorithms with
respect to the central queue algorithm. This is because,
even if the load balancing algorithms are successful in
balancing the load, the local scheduler at each server may
not be able to find a job in its local queue to fill idle
resources, even if such a job exists in the queue of a different
server. Closing this gap is the subject of our current work
and is briefly discussed in Section 5.

Figure 4. Baseline and Extended Algorithm Performance Comparison.
Panels: (b) Sender Initiated Variants; (c) Symmetrically Initiated
Variants; (d) Server Resource Correlation: Src=0.15; (e) Server
Resource Correlation: Src=0.50; (f) Server Resource Correlation:
Src=0.70. The x axis of each panel is the Server Resource Variance, Srv.

5. Summary and Work in Progress

In  this  paper,  we  defined  a  workload  distribution  prob¬ 
lem  for  a  computational  grid  with  near-homogeneous  multi¬ 
resource  servers.  First,  servers  in  the  grid  have  multiple 
resource  capacities,  and  jobs  submitted  to  the  grid  have  re¬ 
quirements  for  each  of  those  resources.  Additionally,  the 
servers  are  homogeneous  in  that  any  job  submitted  to  the 
grid  can  be  executed  by  any  of  the  servers,  but  heteroge¬ 
neous  in  their  various  resource  configurations.  We  next 
investigated  a  load  balancing  approach  to  workload  distri¬ 
bution  for  this  grid.  We  showed  how  previous  baseline 
load  balancing  policies  for  single  resource  systems  failed 
to  maintain  a  workload  at  each  server  which  had  a  good 
affinity  towards  that  server.  We  then  proposed  two  orthog¬ 
onal  extensions  based  on  the  concept  of  resource  balanc¬ 
ing.  The  basic  idea  of  resource  balancing  is  that  the  local 
scheduler  is  more  effective  in  utilizing  the  resources  of  the 
local  server,  if  the  total  relative  resource  requirements  of 
all  jobs  in  a  local  work  queue  match  the  relative  capacities 
of  the  server.  Our  simulation  results  show  that  our  policy 
extensions  provided  a  consistent  5-15%  increase  in  system 
throughput  performance  over  the  baseline  load  balancing  al¬ 
gorithms. 
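
The resource balancing idea summarized above can be made concrete with a small Python sketch that compares the relative resource demand of a local queue with the relative capacities of its server. The mismatch measure used here (the largest per-resource gap between the two share vectors) and the function names are illustrative assumptions; this is not the RB load index defined earlier in the paper.

```python
def relative_shares(values):
    """Scale a list of non-negative amounts so that the entries sum to one."""
    total = sum(values)
    return [v / total for v in values]


def balance_mismatch(queue, capacities):
    """Largest gap between queued-demand share and capacity share.

    queue: list of jobs, each a list of per-resource requirements.
    capacities: per-resource capacities of the server.
    Returns 0.0 when the queued demand has the same proportions as the server.
    """
    if not queue:
        return 0.0  # nothing queued, so nothing can be out of balance
    demand = [sum(job[k] for job in queue) for k in range(len(capacities))]
    demand_share = relative_shares(demand)
    capacity_share = relative_shares(capacities)
    return max(abs(d - c) for d, c in zip(demand_share, capacity_share))
```

For a server with capacities `[4, 8]` (say, CPUs and memory), the queue `[[1, 2], [2, 4]]` is perfectly matched, while the CPU-heavy queue `[[3, 1], [2, 1]]` is not, and a selection policy in this spirit would prefer transfers that reduce the mismatch.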

However,  there  is  still  significant  room  for  improvement 
in  the  workload  distribution  approach.  First,  as  the  re¬ 
source  variance  between  servers  grows,  additional  work¬ 
load  characteristics,  beyond  the  total  resource  balance,  must 
be  taken  into  account  when  evaluating  the  workload  for  a 
given  server.  Specifically,  we  noted  that  the  granularity 
of  jobs  in  a  local  queue  impacts  the  performance  of  the 
smaller  servers.  Intuitively,  small  jobs  should  be  sent  to 
small  servers,  and  large  jobs  should  be  sent  to  large  servers. 
Here,  a  large  job  is  one  that  generally  has  large  resource 
requirements,  and  a  large  server  is  one  that  generally  has 
large  resource  capacities.  Note  that  the  size  of  a  job  is  rela¬ 
tive  to  the  size  of  the  server  to  which  it  is  being  compared. 
Our  current  work  in  progress  is  investigating  refinements  to 
the  load  balancing  policies  which  improve  the  affinity  of  the 
local  workload  to  the  local  server.  Note  that  these  investi¬ 
gations  apply  to  single  resource  servers  as  well. 

Second, there is a persistent performance gap between
a central queue approach to workload distribution and our
load balancing algorithms. Our conjecture is that even if the
load is perfectly balanced, restricting a server, S_i, to execute
jobs only from its local queue will increase the percentage
of time that some of S_i's resources remain idle, when there
may be a job in the queue of a different server, S_j, which
would fit into the idle resources of server S_i. These effects
were noted in our simulations in that even when the servers
were all nearly identical, and an equal load was being
delivered to each server, the system throughput was still
significantly below the performance of the central queue
algorithm. Load balancing schemes were limited to about 85%
of the throughput of the central queue scheme at all tested
values of Srv and Src, as seen in Figures 4(a)-(f).
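
The conjecture can be seen in a toy Python example. The sketch below uses a simple first-fit rule, which is an assumption and not necessarily the local scheduling policy used in the simulations; job and server sizes are made-up numbers.

```python
def fits(job, free):
    """True if every resource requirement fits in the free capacity."""
    return all(need <= avail for need, avail in zip(job, free))


def first_fit(queue, free):
    """Return the first queued job that fits the free resources, else None."""
    for job in queue:
        if fits(job, free):
            return job
    return None


# Hypothetical state: server 0 has (2, 2) units free on its two resources.
local_queues = {0: [(3, 1)], 1: [(1, 1), (4, 4)]}
free_on_server_0 = (2, 2)

# Restricted to its own queue, server 0 finds nothing to run.
local_choice = first_fit(local_queues[0], free_on_server_0)

# With a central queue holding all jobs, a fitting job is found.
central_queue = local_queues[0] + local_queues[1]
central_choice = first_fit(central_queue, free_on_server_0)
```

Here `local_choice` is `None` while `central_choice` is the job `(1, 1)` queued at server 1, so the idle resources on server 0 stay unused unless work can be drawn from elsewhere, even though the two queues could be equally loaded.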

We  are  further  motivated  to  look  at  a  more  central¬ 
ized  approach  by  real-world  computational  grids,  such  as 
NASA’s Information Power Grid (IPG) [6]. The current
implementation of the IPG uses services from the Globus
toolkit to submit jobs, query their status, and query the state
of the grid resources. Globus uses a centralized directory
structure, the Metacomputing Directory Service (MDS), to
store  information  about  the  status  of  the  metacomputing 
environment  and  all  jobs  submitted  to  the  grid.  Informa¬ 
tion  in  the  MDS  is  used  to  assist  in  the  placement  of  new 
jobs  onto  servers  with  appropriate  resources  within  the  grid. 
While  this  approach  is  currently  being  used  in  the  IPG, 
there  are  questions  about  the  scalability  of  such  a  central¬ 
ized  structure.  For  example,  can  a  central  structure  han¬ 
dle  hundreds  of  sites  and  thousands  of  jobs?  How  about 
fault  tolerance?  Our  current  work  in  progress  is  therefore 
investigating  compromises  between  a  single  central  queue 
and  completely  distributed  queues.  The  general  concept  is 
to  keep  work  close  to  the  servers  where  it  will  most  likely 
execute,  and  move  work  to  a  specific  server  when  needed. 
Recent research in dynamic matching and scheduling for
heterogeneous computing systems uses similar approaches,
along with heuristics for matching a job to idle server
resources [12]. Our work in progress attempts to combine the
centralized nature of current mapping approaches with our
resource-balanced workload affinity approach.